NSF PAR Search | NSF Public Access Repository

Practical Considerations of Fully Homomorphic Encryption in Privacy-Preserving Machine Learning

https://doi.org/10.1109/bigdata62323.2024.10825068

Lo, Dan Chia-Tien; Shi, Yong; Shahriar, Hossain; Deng, Bobin; Zhang, Xinyue; Chen, Mei-Lan (December 2024, IEEE)

Machine learning has been successfully applied to big data analytics across various disciplines. However, as data is collected from diverse sectors, much of it is private and confidential. At the same time, one of the major challenges in machine learning is the slow training speed of large models, which often requires high-performance servers or cloud services. To protect data privacy while still allowing model training on such servers, privacy-preserving machine learning using Fully Homomorphic Encryption (FHE) has gained significant attention. However, its widespread adoption is hindered by performance degradation. This paper presents our experiments on training models over encrypted data using FHE. The results show that while FHE ensures privacy, it can significantly degrade performance, requiring complex tuning to optimize.
Full Text Available
The Robustness of Spiking Neural Networks in Communication and its Application towards Network Efficiency in Federated Learning

https://doi.org/10.1109/IPCCC59868.2024.10850123

Nguyen, Manh V; Zhao, Liang; Deng, Bobin; Severa, William; Xu, Honghui; Wu, Shaoen (November 2024, IEEE)

Full Text Available
Characterizing and Understanding the Performance of Small Language Models on Edge Devices

Islam, Romyull; Dhar, Nobel; Deng, Bobin; Nguyen, Tu N; He, Selena; Suo, Kun (October 2024, IEEE International Performance Computing and Communications Conference)

Full Text Available
Evaluation of Thermal Stress on IoT-based Federated Learning

https://doi.org/10.1145/3603287.3651222

Gu, Yi; Zhao, Liang; Deng, Bobin; Wu, Shaoen (April 2024, ACM)

Full Text Available
Fixed-point Encoding and Architecture Exploration for Residue Number Systems

https://doi.org/10.1145/3664923

Deng, Bobin; Nadendla, Bhargava; Suo, Kun; Xie, Yixin; Lo, Dan Chia-Tien (May 2024, ACM Transactions on Architecture and Code Optimization)

Residue Number Systems (RNS) demonstrate the fascinating potential to serve integer addition/multiplication-intensive applications. The complexity of Artificial Intelligence (AI) models has grown enormously in recent years. From a computer system’s perspective, ensuring the training of these large-scale AI models within an adequate time and energy consumption has become a big concern. Matrix multiplication is a dominant subroutine in many prevailing AI models, with an addition/multiplication-intensive attribute. However, the data type of matrix multiplication within machine learning training typically requires real numbers, which indicates that RNS benefits for integer applications cannot be directly gained by AI training. The state-of-the-art RNS real number encodings, including floating-point and fixed-point, have defects and can be further enhanced. To transform default RNS benefits to the efficiency of large-scale AI training, we propose a low-cost and high-accuracy RNS fixed-point representation: Single RNS Logical Partition (S-RNS-Logic-P) representation with Scaling Down Postprocessing Multiplication (SD-Post-Mul). Moreover, we extend the implementation details of the other two RNS fixed-point methods: Double RNS Concatenation (D-RNS-Concat) and Single RNS Logical Partition (S-RNS-Logic-P) representation with Scaling Down Preprocessing Multiplication (SD-Pre-Mul). We also design the architectures of these three fixed-point multipliers. In empirical experiments, our S-RNS-Logic-P representation with SD-Post-Mul method achieves less latency and energy overhead while maintaining good accuracy. Furthermore, this method can easily extend to the Redundant Residue Number System (RRNS) to raise the efficiency of error-tolerant domains, such as improving the error correction efficiency of quantum computing.
Full Text Available
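
As background for the Residue Number Systems entry above, the following is a minimal sketch of plain integer RNS arithmetic, not the fixed-point representations proposed in the paper. It shows why addition and multiplication suit RNS: each residue channel is computed independently, with no carries between channels. The moduli `(7, 11, 13)` and all function names are illustrative assumptions, and results are only valid while they stay below the product of the moduli (1001 here).

```python
from math import prod

# Hypothetical pairwise-coprime moduli; representable range is 0..1000.
MODULI = (7, 11, 13)

def to_rns(x, moduli=MODULI):
    """Encode a non-negative integer as its residues modulo each modulus."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli=MODULI):
    """Add two RNS values channel by channel (no carry between channels)."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Multiply two RNS values channel by channel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))

def from_rns(residues, moduli=MODULI):
    """Decode residues back to an integer via the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        total += r * Mi * pow(Mi, -1, m)
    return total % M

if __name__ == "__main__":
    a, b = to_rns(12), to_rns(34)
    print(from_rns(rns_add(a, b)))  # 12 + 34
    print(from_rns(rns_mul(a, b)))  # 12 * 34
```

Real-number workloads such as matrix multiplication in training need an additional encoding layer on top of this integer scheme, which is the problem the paper's fixed-point representations address.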
Building a Resilient and Sustainable Grid: A Study of Challenges and Opportunities in AI for Smart Virtual Power Plants

https://doi.org/10.1145/3603287.3651202

Islam, Md Romyull; Vu, Long; Dhar, Nobel; Deng, Bobin; Suo, Kun (April 2024, ACM)

In recent years, integrating distributed energy resources has emerged as a pervasive trend in competitive energy markets. The idea of virtual power plants (VPPs) has gained traction among researchers and startups, offering a solution to address diverse social, economic, and environmental requirements. A VPP comprises interconnected distributed energy resources collaborating to optimize operations and participate in energy markets. However, existing VPPs confront numerous challenges, including the unpredictability of renewable energy sources, the intricacies and fluctuations of energy markets, and concerns related to insecure communication and data transmission. This article comprehensively reviews the concept, historical development, evolution, and components of VPPs. It delves into the various issues and challenges encountered by current VPPs. Furthermore, the article explores the potential of artificial intelligence (AI) in mitigating these challenges, investigating how AI can enhance the performance, efficiency, and sustainability of future smart VPPs.
Full Text Available
An Empirical Analysis and Resource Footprint Study of Deploying Large Language Models on Edge Devices

https://doi.org/10.1145/3603287.3651205

Dhar, Nobel; Deng, Bobin; Lo, Dan; Wu, Xiaofeng; Zhao, Liang; Suo, Kun (April 2024, ACM)

The success of ChatGPT is reshaping the landscape of the entire IT industry. The large language model (LLM) powering ChatGPT is experiencing rapid development, marked by enhanced features, improved accuracy, and reduced latency. Due to the execution overhead of LLMs, prevailing commercial LLM products typically manage user queries on remote servers. However, the escalating volume of user queries and the growing complexity of LLMs have led to servers becoming bottlenecks, compromising the quality of service (QoS). To address this challenge, a potential solution is to shift LLM inference services to edge devices, a strategy currently being explored by industry leaders such as Apple, Google, Qualcomm, Samsung, and others. Beyond alleviating the computational strain on servers and enhancing system scalability, deploying LLMs at the edge offers additional advantages. These include real-time responses even in the absence of network connectivity and improved privacy protection for customized or personal LLMs. This article delves into the challenges and potential bottlenecks currently hindering the effective deployment of LLMs on edge devices. Through deploying the LLaMa-2 7B model with INT4 quantization on diverse edge devices and systematically analyzing experimental results, we identify insufficient memory and/or computing resources on traditional edge devices as the primary obstacles. Based on our observation and empirical analysis, we further provide insights and design guidance for the next generation of edge devices and systems from both hardware and software directions.
Full Text Available
Deep Machine Learning on Segmenting and Classifying Crop Images Taken by Unmanned Aerial Vehicle

https://doi.org/10.1109/BigData59044.2023.10386741

Lo, Dan Chia-Tien; Deng, Bobin; Shi, Yong (December 2023, IEEE)
Serverless-DFS: Serverless Federated Learning with Dynamic Forest Strategy

https://doi.org/10.1145/3638837.3638865

Deng, Bobin; Zhang, Xinyue; Lo, Dan Chia-Tien (December 2023, ACM)